fix: smooth temporal speaker labels #7

Merged: loookashow merged 1 commit into main from fix/temporal-speaker-smoothing on May 4, 2026

Conversation

@loookashow
Contributor

Summary

Reduces short speaker label jumps in the final diarization timeline.

  • Add centroid-scored Viterbi smoothing inside each VAD segment
  • Penalize short speaker-label switches while preserving sustained original label runs
  • Collapse brief A-B-A label islands after temporal decoding
  • Pass embeddings into timeline assembly so smoothing can use speaker centroid confidence
  • Add regression tests for flicker removal and real speaker-turn preservation
  • Refresh README/docs benchmark numbers and explain temporal smoothing
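
The first three bullets can be illustrated with a minimal, generic sketch. It is not the code from this PR: the function names (`viterbi_smooth`, `collapse_aba_islands`), the cosine-similarity scoring, the flat `switch_penalty`, and the `max_island` threshold are all assumptions chosen for illustration, and the sketch does not attempt the "preserve sustained original label runs" logic. It assumes one embedding per frame inside a single VAD segment and one centroid per speaker.

```python
import numpy as np


def viterbi_smooth(embeddings, centroids, switch_penalty=0.5):
    """Pick one speaker index per frame, trading centroid similarity
    against a fixed cost for every label switch (hypothetical sketch)."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cen = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    scores = emb @ cen.T  # cosine similarity, shape (frames, speakers)
    n_frames, n_speakers = scores.shape

    # trans[i, j] is 0 when staying on the same speaker, -penalty otherwise.
    trans = -switch_penalty * (1.0 - np.eye(n_speakers))

    best = scores[0].copy()
    back = np.zeros((n_frames, n_speakers), dtype=int)
    for t in range(1, n_frames):
        cand = best[:, None] + trans  # cand[i, j]: arrive at j from i
        back[t] = cand.argmax(axis=0)
        best = cand.max(axis=0) + scores[t]

    # Backtrack from the best final state.
    path = np.empty(n_frames, dtype=int)
    path[-1] = best.argmax()
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


def collapse_aba_islands(labels, max_island=2):
    """Relabel runs of at most `max_island` frames that sit between two
    runs of the same speaker (A-B-A becomes A-A-A)."""
    out = list(labels)
    runs = []  # each entry: [label, start index, length]
    for i, lab in enumerate(out):
        if runs and runs[-1][0] == lab:
            runs[-1][2] += 1
        else:
            runs.append([lab, i, 1])

    for k in range(1, len(runs) - 1):
        lab, start, length = runs[k]
        left, right = runs[k - 1][0], runs[k + 1][0]
        if length <= max_island and left == right:
            out[start:start + length] = [left] * length
            runs[k][0] = left  # later checks see the updated label
    return out


centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
# Four frames near speaker 0, with one ambiguous frame leaning to speaker 1.
flicker = np.array([[1, 0.1], [1, 0.1], [0.6, 0.8], [1, 0.1], [1, 0.1]])
# A sustained turn: four frames of speaker 0, then four of speaker 1.
turn = np.array([[1, 0.1]] * 4 + [[0.1, 1]] * 4)

flicker_path = viterbi_smooth(flicker, centroids).tolist()
turn_path = viterbi_smooth(turn, centroids).tolist()
```

In this toy setup a per-frame argmax would label the middle `flicker` frame as speaker 1, while the penalized decode keeps the whole segment on speaker 0; the sustained change in `turn` still comes through because four frames of higher similarity outweigh a single switch penalty.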

Results

VoxConverse dev benchmark:

| Metric              | Before  | After   |
|---------------------|---------|---------|
| Weighted DER        | 5.16%   | 5.04%   |
| Mean DER            | 6.65%   | 6.58%   |
| Median DER          | 2.37%   | 2.19%   |
| Exact speaker count | 117/216 | 117/216 |
| Within ±1 speaker   | 175/216 | 175/216 |
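
To put the DER deltas in perspective, the short snippet below derives absolute and relative changes from the reported figures. The helper name `relative_reduction` is hypothetical and the numbers are copied from the results above; nothing here is re-measured.

```python
def relative_reduction(before, after):
    """Relative improvement in percent, where lower values are better."""
    return (before - after) / before * 100.0


# (before, after) DER percentages as reported above.
results = {
    "Weighted DER": (5.16, 5.04),
    "Mean DER": (6.65, 6.58),
    "Median DER": (2.37, 2.19),
}

for name, (before, after) in results.items():
    print(
        f"{name}: {before - after:.2f} points absolute, "
        f"{relative_reduction(before, after):.1f}% relative"
    )
```

The speaker-count rows are unchanged, which fits a change that only smooths labels in the timeline rather than altering how many speakers are detected.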

Validation

  • python -m ruff check .
  • python -m ruff format --check .
  • python -m pytest tests/test_diarize.py --cov=src/diarize --cov-report=term-missing -q
  • python -m mkdocs build --strict --site-dir /private/tmp/diarize-site

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

@loookashow loookashow merged commit 77a4684 into main May 4, 2026
9 checks passed
@loookashow loookashow deleted the fix/temporal-speaker-smoothing branch May 4, 2026 16:52